Rapid Adaptation of NE Resolvers for Humanities Domains using Active Annotation
Abstract
The entities mentioned in collections of scholarly articles in the Humanities (and in other scholarly domains) belong to different types from those familiar from news corpora; hence new resources need to be annotated to create supervised taggers for tasks such as named entity (NE) extraction. In such domains, however, it is essential to make the best possible use of the annotators. One technique designed for this purpose is active annotation. We discuss our use of active annotation for annotating corpora of articles about Archaeology in the Portale della Ricerca Umanistica Trentina.
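Active annotation is often realised as uncertainty-based selection: the current tagger scores the unlabelled pool, and the items it is least sure about are sent to the annotators first. The following is a minimal sketch of that general idea, not the procedure used in the paper. The function names, the entropy criterion, and the toy probabilities are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(pool, predict_proba, batch_size=2):
    """Return the batch_size items whose predicted labels are most uncertain."""
    ranked = sorted(pool, key=lambda item: entropy(predict_proba(item)), reverse=True)
    return ranked[:batch_size]


# Hypothetical stand-in for a tagger: made-up label distributions over
# three entity types for four candidate mentions (illustrative values only).
TOY_PROBS = {
    "mention_a": [0.34, 0.33, 0.33],  # nearly uniform: very uncertain
    "mention_b": [0.90, 0.05, 0.05],
    "mention_c": [0.50, 0.45, 0.05],  # torn between two types
    "mention_d": [0.98, 0.01, 0.01],  # confident
}

chosen = select_for_annotation(list(TOY_PROBS), TOY_PROBS.get, batch_size=2)
print(chosen)
```

In a full loop, the selected items would be labelled by the annotators, added to the training set, and the tagger retrained before the next round of selection.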
Similar resources
Linguistic Issues in Language Technology – LiLT
This contribution investigates novel techniques for error detection in automatic semantic annotations, as an attempt to reconcile error-prone NLP processing with high quality standards required for empirical research in Digital Humanities. We demonstrate the state-of-the-art performance of semantic NLP systems on a corpus of ritual texts and report performance gains we obtain using domain adapt...
Combining Active Learning and Partial Annotation for Japanese Dependency Parsing
The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows how active learning can be used for domain adaptation of d...
Combining Active Learning and Partial Annotation for Domain Adaptation of a Japanese Dependency Parser
The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows that active learning can be used for domain adaptation of ...
Annotating Archaeological Texts: An Example of Domain-Specific Annotation in the Humanities
Developing content extraction methods for Humanities domains raises a number of challenges, from the abundance of non-standard entity types to their complexity to the scarcity of data. Close collaboration with Humanities scholars is essential to address these challenges. We discuss an annotation schema for Archaeological texts developed in collaboration with domain experts. Its development requ...
AVATecH: Audio/Video Technology for Humanities Research
In the AVATecH project, the Max-Planck Institute for Psycholinguistics (MPI) and the Fraunhofer institutes HHI and IAIS aim to significantly speed up the process of creating annotations of audio-visual data for humanities research. For this we integrate state-of-the-art audio and video pattern recognition algorithms into the widely used ELAN annotation tool. To address the problem of heterogeneou...
Journal title: JLCL
Volume: 26
Publication date: 2011